The Mit Alewife Machine: a Large-scale Distributed-memory Multiprocessor
نویسندگان
چکیده
The Alewife multiprocessor project focuses on the architecture and design of a large-scale parallel machine. The machine uses a low dimension direct interconnection network to provide scalable communication band-width, while allowing the exploitation of locality. Despite its distributed memory architecture, Alewife allows eecient shared memory programming through a multilayered approach to locality management. A new scalable cache coherence scheme called LimitLESS directories allows the use of caches for reducing communication latency and network bandwidth requirements. Alewife also employs run-time and compile-time methods for partitioning and placement of data and processes to enhance communication locality. While the above methods attempt to minimize communication latency, remote communication with distant processors cannot be completely avoided. Alewife's processor, Sparcle, is designed to tolerate these latencies by rapidly switching between threads of computation. This paper describes the Alewife architecture and concentrates on the novel hardware features of the machine including LimitLESS directories and the rapid context switching processor.
منابع مشابه
Latency Tolerance through Multithreading in Large-Scale Multiprocessors
In large-scale distributed-memory multiprocessors, remote memory accesses su er signi cant latencies. Caches help alleviate the memory latency problem by maintaining local copies of frequently used data. However, they cannot eliminate the latency caused by rst-time references and invalidations needed to enforce cache coherence. Multithreaded processors tolerate such latencies by rapidly switchi...
متن کاملHigh-performance all-software distributed shared memory
The C Region Library (CRL) is a new all-software distributed shared memory (DSM) system. CRL requires no special compiler, hardware, or operating system support beyond the ability to send and receive messages between processing nodes. It provides a simple, portable, region-based shared address space programming model that is capable of delivering good performance on a wide range of multiprocess...
متن کاملA Retrospective on The MIT Alewife Machine: Architecture and Performance
The MIT Alewife project evolved out of exploratory work at Stanford on directory schemes for cache coherence [1] (also included in this issue). Using data from small bus-based multiprocessors, this early work demonstrated that directory schemes were as efficient as bus-based snooping protocols, and that by distributing directories along with main memory, they could provide the foundations for a...
متن کاملRun-time thread management for large-scale distributed-memory multiprocessors
E ective thread management is crucial to achieving good performance on large-scale distributed-memory multiprocessors that support dynamic threads. For a given parallel computation with some associated task graph, a thread-management algorithm produces a running schedule as output, subject to the precedence constraints imposed by the task graph and the constraints imposed by the interprocessor ...
متن کاملMechanisms and interfaces for software-extended coherent shared memory
Software-extended systems use a combination of hardware and software to implement shared memory on large-scale multiprocessors. Hardware mechanisms accelerate common-case accesses, while software handles exceptional events. In order to provide fast memory access, this design strategy requires appropriate hardware mechanisms including caches, location-independent addressing, limited directories,...
متن کامل